Add eth_batchCall
#312
base: main
Conversation
My implementation of this (slightly different, but not totally) from 1.5 years ago:
https://github.com/Zoltu/nethermind-plugin-multicall/blob/main/source/MulticallModule.cs#L29
stateOverrides:
  title: State overrides
  $ref: '#/components/schemas/StateOverrides'
This feels out of place here. What is the reasoning for having this as a parameter? I believe it also greatly complicates implementation in some clients compared to not including it, so I would like to see a compelling reason for its inclusion.
This is a heavily used feature of eth_call. I agree it is less necessary in a multicall, since you can build up the state in previous calls. However, not all state mutations are possible that way (e.g. simply replacing a contract's code).

I agree this is extra work for client devs, but I would like to hear from them whether this is indeed a complication. It seems to me they all have a way to build a temporary state for serving eth_call; this goes a step further by modifying parts of that state.
I think the state override feature is probably useful enough to warrant the complexity. I can imagine a security researcher breaking a problem down into two parts: if this value can ever become X, there is a vulnerability to report. They could research that aspect first (as it might be simpler) and then research "can this value ever become X?".

It could also be used to fake signers of a multi-sig that stores approvals (as many do).
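To make the discussion concrete, here is a hypothetical `stateOverrides` payload for the two use cases mentioned above: replacing a contract's deployed code (which cannot be achieved by prior calls in the batch) and faking a multisig approval by writing a storage slot. The field names (`code`, `stateDiff`) follow geth's existing `eth_call` override convention and the address and slot values are made up for illustration.

```python
# Hypothetical stateOverrides object: one account whose code is replaced and
# whose storage slot 0x...01 is set, e.g. to fake a stored multisig approval.
# All concrete values here are illustrative assumptions, not part of the spec.
state_overrides = {
    "0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984": {
        # Replace the deployed bytecode entirely
        "code": "0x600160005260206000f3",
        # Overwrite a single storage slot (slot -> value, both 32 bytes)
        "stateDiff": {
            "0x" + "00" * 31 + "01": "0x" + "00" * 31 + "01",
        },
    }
}
```

A client serving the batch would layer these overrides on top of the base block's state before executing the first call.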
error:
  title: Execution error
  type: string
What does `string` here mean? I believe the revert reason is just a byte array. While Solidity has some helpers like `require(..., message)` that make it easy to encode a string in those bytes, there are contracts out there that return error codes as numbers, and it could even contain data arrays.

I think this should be `bytes`, and it is left up to the recipient to decode that as appropriate for the contract in question.
To clarify, this is not directly the EVM revert error. It encompasses other errors such as OOG. If a call reverts, the error will be "execution reverted" and the `return` data field will contain the bytes that you mentioned.

Happy to change the description of the field if it is confusing.
It would be good to indicate that in the case of an EVM revert, `return` will contain the revert return bytes.

Is there a way in this schema to express that `error` is optional (may not be defined in the resulting object)?
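The semantics agreed on above can be summarized in a small sketch: `error` is a human-readable string ("execution reverted", an OOG message, etc.) and is absent on success, while `return` always carries the raw bytes, including ABI-encoded revert data. The helper name and exact strings are assumptions for illustration.

```python
# Sketch of how a consumer might interpret a CallResult object under the
# semantics described in this thread. `describe_result` is a hypothetical
# helper, not part of the proposed API.
def describe_result(result: dict) -> str:
    if "error" not in result:
        return f"success, returned {result['return']}"
    if result["error"] == "execution reverted":
        # ABI-encoded revert data (e.g. Error(string)) lives in `return`
        return f"reverted with data {result['return']}"
    # Other execution failures, e.g. out of gas
    return f"failed: {result['error']}"
```

Decoding the revert bytes (e.g. as `Error(string)`) is deliberately left to the recipient, per the discussion above.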
src/schemas/execute.yaml (outdated)
properties:
  number:
    title: Number
    $ref: '#/components/schemas/uint256'
  difficulty:
    title: Difficulty
    $ref: '#/components/schemas/uint256'
  time:
    title: Time
    $ref: '#/components/schemas/uint256'
  gasLimit:
    title: Gas limit
    $ref: '#/components/schemas/uint64'
  coinbase:
    title: Coinbase
    $ref: '#/components/schemas/address'
  random:
    title: Random
    $ref: '#/components/schemas/hash32'
  baseFee:
    title: Base fee
    $ref: '#/components/schemas/uint256'
Do we have a mechanism for indicating optionality? I feel like all of these should be optional and have reasonable defaults provided by the execution client.
I also feel like these should be constrained to values that would otherwise pass validation, so clients don't have to write code that skips over any more validation than necessary (skipping block signature validation, for example, is necessary). For example, the block number must be monotonically increasing, the time must be strictly greater than the previous block's, the gasLimit cannot change by more than 1/1024 from the previous block, etc.
> Do we have a mechanism for indicating optionality? I feel like all of these should be optional and have reasonable defaults provided by the execution client.

The default value will be taken from the block the calls will be executed on top of.

> I also feel like these should be constrained to values that otherwise pass validation so clients don't have to...

Is this still a concern now that there's one set of block overrides for the whole batch?
This should be clearly indicated in this specification, perhaps in a `description` field? Some of them cannot be defaulted to the parent block (like number and timestamp), so I think we should have a description for each indicating where it gets its default from. Also, I would really like to see it indicated somehow that these are optional, ideally in the schema itself, but alternatively in a description field or comment of some kind.
> Is this still a concern now that there's one set of block overrides for the whole batch?

I think these are still relevant. If you are building off of block `n`, then you cannot have a `number` that is less than or equal to `parent.number`. Similarly, you cannot have a timestamp that is less than or equal to `parent.timestamp`. A similar restriction applies to the gas limit.
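The constraints discussed above can be expressed compactly. The sketch below checks the three rules mentioned in this thread: number and time must strictly exceed the parent's, and (per consensus rules) the gas limit may change by less than 1/1024 of the parent's. The function name and dict layout are assumptions for illustration, not part of the proposed schema.

```python
# Validate a blockOverrides object against its parent block, per the rules
# suggested in this discussion. Both arguments are plain dicts with integer
# fields "number", "time", and "gasLimit" (an illustrative representation).
def validate_block_overrides(parent: dict, ov: dict) -> list[str]:
    errors = []
    if "number" in ov and ov["number"] <= parent["number"]:
        errors.append("number must exceed parent.number")
    if "time" in ov and ov["time"] <= parent["time"]:
        errors.append("time must exceed parent.time")
    if "gasLimit" in ov:
        # Consensus bound: |delta| must be strictly less than parent/1024
        delta = abs(ov["gasLimit"] - parent["gasLimit"])
        if delta >= parent["gasLimit"] // 1024:
            errors.append("gasLimit may change by less than 1/1024 of parent")
    return errors
```

Whether a simulation API should enforce these at all (see the gas-limit comment below this one) is exactly the open question in the thread.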
Note: on the gas limit, there may be value in expressly allowing changes that are against consensus rules, so that people can run a series of calls that use more gas than the current gas limit.
CallResult:
  title: Result of a call
  type: object
  required:
    - return
  properties:
    return:
      title: Return data
      $ref: '#/components/schemas/bytes'
    error:
      title: Execution error
      type: string
The result for each transaction should include logs and gas spent as well. I suspect that all clients are generating this information anyway, and throwing it away is a waste when we could just return it all. In many cases, this will also make it so the caller doesn't need to do multiple round trips (e.g., once for gas estimate and again for speculative execution and again to read a deterministic event like contract creation details).
If the call is a contract creation (`to: null`), it should also return the address of the created contract, for the same reason.
> include logs and gas spent as well

@s1na mentioned an argument against including gas spent in the return interface (here). The intention I mentioned was to use the gas spent as-is for gas limits, which I agree would fail in some cases. However, including gas spent in the response could still be useful if used with care (e.g. passing a factor more gas to account for the 63/64 of gas forwarded to child message calls). And in the documentation, a note can be added that gas spent is not accurate enough to be used as a gas limit.
I personally don't think it is worth trying to optimize the edge case where someone does `if (GAS) ...`, because that is very rare, and when it is used, the dapp almost always explicitly sets the GAS required anyway, so the returned gas used won't matter.

I think the value add of returning gas used far outweighs the downsides of it being wrong in a handful of esoteric scenarios. I do recommend that we standardize on the amount of gas provided to each call in `eth_multicall` though. Perhaps `block.gas_limit - gas_used_in_previous_transactions_in_multicall`? Essentially, assume that the whole set of transactions is intended to fit inside a block (this also provides a sanity check on total gas used) and they "burn through" the block's gas as they are executed.

This would prevent people from using this to do mega-gas operations, but I think we can address that by simply having `eth_multicall` accept a `gas_limit` parameter for the block and for each transaction, so this isn't a hard limiter.
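The "burn through the block's gas" default proposed above is simple to state precisely: when a call in the batch carries no explicit gas field, it receives whatever remains of the block's gas budget after the earlier calls executed. This is a sketch of that proposal only (the function name and list-based interface are illustrative assumptions).

```python
# Compute the default gas limit for each call in a batch under the scheme
# block.gas_limit - gas_used_in_previous_transactions_in_multicall.
# gas_used[i] is the gas actually consumed by call i during simulation.
def default_gas_for_calls(block_gas_limit: int, gas_used: list[int]) -> list[int]:
    budgets, remaining = [], block_gas_limit
    for used in gas_used:
        budgets.append(remaining)   # call i may use everything still left
        remaining -= used           # later calls see a smaller budget
    return budgets
```

This also yields the sanity check mentioned above: if `remaining` ever goes negative, the batch would not fit in one block.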
I still think `gasUsed` should not be used as an estimate. But I can sympathise with adding this info to the call; it comes at no overhead. Right now all of this can be fetched in one call through geth's tracing API (via the `callTracer`), but I agree it should be part of the `eth_` namespace. I'm happy to add these.

Re `gas_limit`: there is a need for a block gas limit set in the client itself, because providers like Infura will want to cap the runtime of API methods. Geth has both a gas limit and a deadline in seconds.
Setting an upper bound on the gas able to be used by an `eth_call` or `eth_multicall` feels like it is out of scope of this schema. On my local node, I should be able to do calls that use trillions of gas, and the fact that a centralized service (which we should not be catering our designs toward) needs rate limiting should not prevent a self-hosted operator from doing whatever they want.
To clarify: what I meant was there is a CLI flag anyway that will set a limit for all `eth_call`s and `eth_multicall`s. On your local machine you're free to set it to trillions. The question is whether a per-invocation limit is also useful.

> This would prevent people from using this to do mega gas operations,

Adding a `gasLimit` parameter won't solve this for public-facing nodes. Users will simply set a high gas limit.
> To clarify: what I meant was there is a CLI flag anyway that will set a limit for all `eth_call`s and `eth_multicall`s. On your local machine you're free to set it to trillions. Question is whether a per-invocation limit is also useful.

Such a setting feels like it is out of scope of this standard? I might be able to be convinced that it is relevant, but rate limiting features and DoS protection generally aren't standardized.

What we need to standardize is "how much gas is given when the parameter is missing", and picking something "reasonable" (e.g., not 2^256, and not 30,000) as a default feels like the right way to go to me. If a particular client has limits that are lower than that (such as from configuration), then I think it should return an appropriate rate limiting error and require the user to explicitly set the gas limits in their requests if the user exceeds those limits.
I think that the `if (GAS)` case is not really worth considering, but the 63/64 rule and refunds are.

However, there are really two uses for gas that should be handled:
1.) What you need to attach to a transaction for it to succeed (eth_estimateGas/eth_batchEstimateGas).
2.) What you actually pay for: some way to determine what my EOA is actually going to be charged for the transaction. It feels most natural for that to occur in eth_batchCall. If there is a back-run, it is very important to the sender how much gas is actually going to be charged, as MEV profit can often be extremely small and the difference between the actual gas used and the gas estimate is the difference between profit and loss.
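The 63/64 concern raised in points 1 and 2 comes from EIP-150: a caller can forward at most 63/64 of its remaining gas to a child call, so the gas a simulation reports as *used* can undershoot the limit the transaction actually needs when nested calls are involved. A common heuristic (an assumption here, not part of this proposal) is to pad the reported figure by 64/63 once per call depth:

```python
# Pad simulated gasUsed to obtain a workable gas limit, compensating for the
# EIP-150 rule that only 63/64 of remaining gas is forwarded to child calls.
# `depth` is the deepest call nesting observed in the simulation.
def pad_for_call_depth(gas_used: int, depth: int) -> int:
    gas = gas_used
    for _ in range(depth):
        gas = gas * 64 // 63 + 1  # +1 guards against integer-division loss
    return gas
```

This only addresses use (1); use (2), the amount actually charged, is exactly what returning `gasUsed` from the batch call would give directly.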
Sorry for letting the ball drop on this for a while. I'm going to begin addressing the reviews. To start:

Will address the rest of the reviews on their respective comments.
This reverts commit 9861fd8.
  title: Gas limit
  $ref: '#/components/schemas/uint64'
coinbase:
  title: Coinbase
Suggested change: `title: Coinbase` → `title: FeeRecipient`

I believe `fee recipient` is the nomenclature we are using now (as of The Merge) for the account that receives the transaction fees in the block. This is separate from the proposer, which I don't think the execution layer has access to.
I think there would be value in making it so multiple blocks can be simulated. My initial thinking is to have something like:

{
  parent_block,
  blocks: [
    { block_overrides, transactions },
    { block_overrides, transactions },
    ...
  ]
}

Everything else would stay the same, and the first block in the array would be built on top of the looked-up `parent_block`.
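A minimal concrete instance of that multi-block shape, as a sketch: the outer object names a parent, and each entry in `blocks` bundles its own overrides with its transactions. All field names are taken from the comment above and are not finalized; the transaction contents are placeholders.

```python
# Hypothetical multi-block request body matching the shape proposed above.
# Block 1 is built on top of parent_block; block 2 on top of block 1.
multi_block_request = {
    "parent_block": "latest",
    "blocks": [
        {
            "block_overrides": {"time": "0x64"},
            "transactions": [{"to": None, "data": "0x6001600101"}],  # creation
        },
        {
            "block_overrides": {},  # defaults derived from the prior block
            "transactions": [],
        },
    ],
}
```

Note how the single-block case is just this structure with a one-element `blocks` array, which is the crux of the "subset" argument made later in the thread.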
Also, it's worth discussing a deprecation policy for API methods. Multicall is more general than call and, if adopted, will fully supersede `eth_call`.
I think multiblock/chain execution is a useful feature. However, after reading comments such as ethereum/go-ethereum#25743 (comment) and thinking more about this, I've made up my mind and would like to champion only the single-block-multi-call variant at this time. How I envision the …
I believe the referenced comment by @holiman indicates that having multiple transactions with different block parameters could have sharp edges. A multi-block multicall, however, wouldn't have such sharp edges.
Wouldn't the difference between single-block and multi-block (for a multicall that only actually spans one block) just be a couple extra sets of braces and a property name, more or less? If we use a tuple you don't even need the property name (though I don't recommend that).

{ block, stateOverrides, blockOverrides, calls: [transaction1, transaction2, ..., transactionN] }

vs

{ block, blocks: [ { stateOverrides, blockOverrides, calls: [transaction1, transaction2, ..., transactionN] } ] }

It would be unfortunate to need a separate JSON-RPC method for multi-block call when the difference between them is so minimal, and the work required to implement them both is likely very similar, especially since single-block multicall is purely a subset of multi-block multicall.
I'm pretty sure his later comment applies to multi-block. There is a difference between simply executing multiple txes and executing txes from different blocks (it doesn't matter if they are ordered blocks). Namely, it is the coinbase reward at the moment; soon it will also include the withdrawal logic. Also, once you have multiple blocks you need to get …

No, because once you have multiple blocks the override fields become mandatory. Unless you special-case it for one block, but that's ugly IMO.
Why would you require block overrides for multiple blocks but not for a single block? It seems to me that either you need to require overrides in both cases, or they are optional in both cases. Can you elaborate on why you think these situations are different?

It is certainly possible that there is some implementation detail in some client that makes building multiple blocks in a row outside of the current chain particularly difficult, but that feels unlikely to me (it would be great to get a concrete answer from client devs on this). In Nethermind at least, I believe it would not be notably harder to build a two-block chain in memory vs a one-block chain in memory.
Can this be named something other than multicall, as that is confusing? Maybe …
@MicahZoltu are you willing to champion the multi-block variant as a separate proposal? We can raise them both during ACD as alternatives. I'd be happy if either of them gets adopted, and I can take care of the implementation in Geth. I'm all for the extra flexibility of the multi-block variant, but it seems to me that the use case is more niche and I don't want the extra complexity to delay adoption.
For a single block you can simply take the parameters of …

@sambacha I would go with …

What do you find confusing about …
Given how difficult it is to get new JSON-RPC methods standardized, I think that we should take our time and spec this out in a way that we think is best long term for the ecosystem, rather than trying to do the thing that requires the least amount of work. I say this as someone who is currently working on a project that has a hard dependency on the single-block version of this too, and I don't benefit directly at all from the multi-block version!
I agree with this, but it doesn't seem to answer my question of why multi-block requires overrides. Am I misunderstanding something? It sounds like you are saying that multi-block does not require overrides.

Note: each successive block should, by default, increment the time by 12 seconds and the block number by 1. This includes the first block built on the real chain and the subsequent blocks. They should keep the gas limit unchanged from the parent, and probably just include a deterministic "random" number for randao (which IIRC is now in the block header), like a hash of the parent's randao or something. This should be written into the standard (regardless of whether we do multi-block or single-block).
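The proposed per-block defaults (number +1, time +12s, gas limit carried over, "random" derived deterministically from the parent's randao) can be sketched as follows. Hashing the parent's randao is the suggestion from the comment above, not a finalized rule, and the field names are illustrative.

```python
import hashlib

# Derive default header fields for the next simulated block from its parent,
# per the defaults proposed in this comment (assumptions, not the spec).
def next_block_defaults(parent: dict) -> dict:
    return {
        "number": parent["number"] + 1,      # always advances by one
        "time": parent["time"] + 12,         # 12-second slot time
        "gasLimit": parent["gasLimit"],      # unchanged from parent
        # Deterministic "random": hash of the parent's 32-byte randao value
        "random": "0x" + hashlib.sha256(bytes.fromhex(parent["random"][2:])).hexdigest(),
    }
```

Writing a rule like this into the standard would make multi-block simulations reproducible across clients.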
I would much rather go to ACD with a single proposal than two competing proposals. We have seen in the past how two competing proposals both being championed can indefinitely delay inclusion (e.g., BLS), so generally speaking it is best to get agreement among champions first.

That being said, perhaps we could go to ACD with the multi-block proposal but mention that if any of the clients believe multi-block will be particularly difficult to implement, then that part can be pulled out of the proposal. As I mentioned above, my suspicion is that they are nearly equally difficult to implement, but if one or more client teams indicate that their architecture makes multi-block hard, that is the most likely thing to convince me to back off.
I want to test this extension on my geth, but I couldn't figure out how to call the JSON-RPC method. Can you give an example, please?
I would like to see a standardized batch call method implemented into geth upstream, and of the available proposals, this one seems like the best and even has an implementation with an open PR (ethereum/go-ethereum#25743) that is only blocked by the spec agreement. There have also been attempts to upstream eth_callBundle (a Flashbots-specific non-standard API I helped create), but I would like to see something more flexible and useful across multiple blocks, with features for security researchers and end users (wallets) as well. If eth_batchCall (or similar) is upstreamed into geth, I think Flashbots is likely to switch to it and recommend its use in their libraries and documentation.

I would like to see this API used to prevent Ethereum users from being scammed by tokens that only allow purchase (and revert on sell). An API like this could verify the ability to re-sell tokens and automatically protect users.

@s1na I see there was some correspondence here, but after a few comments momentum was lost. Are you open to continuing this discussion and helping get this across the line? I would be happy to jump on a call to hash things out or add more resources as necessary; I think this is important work and not too far from being completed.
Summary

This PR adds a new method to the `eth` namespace which performs a sequence of call messages, each building on the previous call's state. It will use a given block tag as base state, similar to `eth_call`. However, this state can be overridden by providing a map of addresses to accounts.

Motivation

#267 already describes the motivation a bit. Further, this is a demanded feature in geth (see ethereum/go-ethereum#24089).
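To illustrate the method described in the summary, here is a hypothetical JSON-RPC request: two calls executed in sequence on top of the `latest` state, with a state override funding the sender. The exact parameter layout, field names, and addresses are assumptions based on this PR's discussion, not the finalized schema.

```python
import json

# Sketch of an eth_batchCall request: the second call observes any state
# changes made by the first, and stateOverrides seeds the base state.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "eth_batchCall",
    "params": [
        {
            "block": "latest",
            "stateOverrides": {
                # Give the sender 1 ETH so the calls can pay for value transfers
                "0xc0ffee254729296a45a3885639AC7E10F9d54979": {
                    "balance": "0xde0b6b3a7640000"
                }
            },
            "calls": [
                {
                    "from": "0xc0ffee254729296a45a3885639AC7E10F9d54979",
                    "to": "0x5FbDB2315678afecb367f032d93F642f64180aa3",
                    "data": "0x70a08231",  # read some state first
                },
                {
                    "from": "0xc0ffee254729296a45a3885639AC7E10F9d54979",
                    "to": "0x5FbDB2315678afecb367f032d93F642f64180aa3",
                    "value": "0x1",        # then act on it
                },
            ],
        }
    ],
}
payload = json.dumps(request)  # body to POST to the node's RPC endpoint
```

The response would be an array of CallResult objects, one per call, in execution order.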
Rationale

The `debug_traceCall` method in geth now supports state and block overrides, which couldn't be added to `eth_call` as they'd be a breaking change (assuming there would be consensus to add these fields). See the next 2 points for possible future additions.

… `tx_idx` to get the same effect.
ethereum/go-ethereum#25743
Notes

The storage schema doesn't validate that keys can only be 256-bit, since JSON supports only string keys for objects.
Todos
Alternative to #272